Surgical Results and Complications for Open, Laparoscopic, and Robot-assisted Radical Prostatectomy: A Reverse Systematic Review

Take Home Message
Our review revealed a significantly higher annual volume of surgery per surgeon (AVSS), a higher percentage of low-risk patients, and lower rates of lymphadenectomy and complications for robot-assisted radical prostatectomy (RARP) compared to open and laparoscopic techniques. RARP showed better performance for all the perioperative variables studied except for operative time. Among all the outcomes, only AVSS was significantly correlated with complication rates. The AVSS required to achieve a target complication rate was significantly lower for RARP.


Introduction
The advantages of minimally invasive surgery in the surgical treatment of prostatic carcinoma are well known and the European Association of Urology and American Urological Association guidelines recognize these advantages and recommend the minimally invasive route owing to better perioperative results in terms of bleeding and transfusion rates, length of hospital stay, and complications [1,2].
During the contemporary history of radical prostatectomy (RP), the three main techniques, namely retropubic RP (RRP), laparoscopic RP (LRP), and robot-assisted RP (RARP), have been compared in several studies with different levels of evidence, ranging from expert opinions to systematic reviews (SRs) [3].
SR with meta-analysis is an excellent tool for bringing together methodologically similar studies in order to increase the number of patients and thus the statistical strength of comparisons. However, during the process of choosing these studies, a lot of information can be excluded, leading to a very specific clinical scenario that is often unrepresentative of real life [4].
Thus, our study group designed a new SR methodology called reverse SR (RSR) to compare the three RP techniques [5] and to generate a heterogeneous population database that covers several different scenarios. Here we used RSR to understand how perioperative variables and complication rates have evolved over the 20 yr for which the three techniques have coexisted and to explore correlation with possible bias and confounding that may have influenced the results and trends.

Description of the methodology
In classic SR, a systematic search is performed in databases to locate original clinical studies that answered a specific question. After this search, studies that are homogeneous and comparable (that is, studies that used the same methods, populations, and outcomes) are selected for inclusion and can be merged for statistical analysis, called a meta-analysis [6,7]. In the case of RSR, the opposite path is followed. The literature search is carried out with the objective of identifying all SRs in the history of the technique under study, regardless of the question of interest, and gathering as many of these studies as possible to generate a heterogeneous population with complete information for the outcomes that most interested the research community in that area. At this stage, when gathering all the SRs, the main focus is to capture all the studies included in these reviews that were used to answer the investigators' questions (Supplementary material).

Search methodology and study design
In December 2020, a literature search was carried out using eight databases: PubMed, Web of Science, Cochrane Library, Embase, ProQuest, CINAHL (The Cumulative Index to Nursing and Allied Health Literature), VHL/Bireme, and Scopus (Fig. 1). We searched for SRs, with or without meta-analysis, that addressed the techniques of RRP, LRP, and RARP, with a general strategy based on health descriptors and synonyms referring to the terms "Laparoscopy", "Open", "Retropubic", "Prostatectomy", "Robotic Surgical Procedures", "Systematic Review", and "Meta-analysis" in the "Title, Abstract and Subject" fields. We then applied the limiters "humans", gender ("male"), language ("English"), and type of study ("Systematic Review"). The search covered publications between January 1, 2000 and December 5, 2020. For each database, the necessary adaptation of the search methodology was carried out (Supplementary material).
After the reviews were identified in the initial search, two researchers (T.B.C.M. and L.O.R.) independently selected reviews that included at least one of the three RP techniques. After the initial screening, the full texts were analyzed and any discrepancies were resolved after open discussion between the authors. Reviews without systematization of the search or integrative methodology, conference and congress abstracts, and papers on other techniques were excluded.
Owing to the difficulty in standardizing health descriptors (MeSH terms) for the databases and classifying a study as an SR, we included studies that, despite not mentioning if the PRISMA criteria [4] were followed in their methodology, provided a clear description of the systematization of the search criteria.
Once all the SRs were chosen, the next step was to extract all the articles cited in the bibliographic references that were included in these SRs for analysis. Publications that were abstracts, meeting reports, or congress proceedings were excluded. As before, two researchers separately reviewed the studies (T.B.C.M. and L.O.R.) and discrepancies in selection were resolved via open discussion.
After the sample was chosen via the systematized method described, all the studies were analyzed by the main author (T.B.C.M.) and the largest amount of data available was captured and tabulated in a dedicated Excel spreadsheet.
When a study evaluated more than one cohort, each one was considered as an isolated study and was called a report, which is the unit of publication used in the study.
The global content from all of the studies selected, including bibliographic, demographic, and clinicosurgical variables, was used to generate a reference population database for various studies and applications.

Variables analyzed and comparative methods
For this study, perioperative variables separated into the three groups (RRP, LRP, and RARP) were analyzed, including: number of patients, annual volume of surgeries per surgeon (AVSS), age (yr), body mass index (kg/m²), initial prostate-specific antigen (PSA; ng/ml), Gleason score (mean and stratified), clinical T stage, operative time (min), estimated blood loss (EBL; ml), blood transfusion rate (%), length of hospital stay (d), bladder catheterization time (d), and overall and stratified perioperative complication rates (minor, major, and Clavien-Dindo grade I-V [8]).
The mean values for these variables were computed to calculate population reference values. A temporal analysis of the variable means was performed by dividing the reports into four periods in relation to year of publication (first period, before 2005; second period, 2006-2010; third period, 2011-2015; and fourth period, after 2015). In addition, a correlation analysis of the variables was performed to identify factors related to the complication rate.
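The period assignment described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `publication_period` and the flag `include_2005_in_first` are hypothetical, and because the text gives "before 2005" and "2006-2010" without placing 2005 itself, the handling of that year is an explicit assumption exposed as a parameter.

```python
def publication_period(year, include_2005_in_first=True):
    """Map a publication year to one of the four analysis periods (1-4).

    ASSUMPTION: the source does not say where 2005 belongs, so the caller
    decides via include_2005_in_first (hypothetical parameter).
    """
    if year < 2005 or (year == 2005 and include_2005_in_first):
        return 1  # first period: before 2005
    if year <= 2010:
        return 2  # second period: 2006-2010 (plus 2005 if the flag is False)
    if year <= 2015:
        return 3  # third period: 2011-2015
    return 4      # fourth period: after 2015


# Example: group a few hypothetical publication years by period.
example_years = [2003, 2008, 2012, 2019]
example_periods = [publication_period(y) for y in example_years]
```

Keeping the ambiguous year behind a parameter makes it easy to check whether the temporal trends are sensitive to that choice.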

Statistical analysis
The measure of central tendency was represented by the mean, and dispersion by the standard error of the mean. Comparisons between means were performed using parametric analysis of variance (ANOVA), with the post hoc multiple-comparison correction (Tukey, Bonferroni, or Games-Howell) chosen according to the homogeneity of variance as assessed with the Levene test. Correlation analyses of continuous variables were performed using Spearman's correlation. The regression curve was adjusted using the rational regression model (nonlinear). The significance level was set at p < 0.05 (two-tailed). Statistical analyses were performed in SPSS v.24. Curve Express Professional v2.7 was used for regression graphs and adjustments.
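For readers who want to reproduce the descriptive and correlation steps outside SPSS, the sketch below computes a mean with its standard error and Spearman's rho using only the Python standard library. It is an illustration under stated assumptions, not the authors' workflow: the function names are hypothetical, the example values are invented, and the ANOVA, Levene, and post hoc steps are not covered.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for report-level values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate the SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented example values (NOT data from the study).
example_avss = [10, 20, 30, 40, 50]
example_rates = [25.0, 18.0, 20.0, 12.0, 9.0]
example_rho = spearman_rho(example_avss, example_rates)
```

A negative rho in this toy example simply means that higher volumes go with lower complication rates in the invented data; it says nothing about the magnitude reported in Table 4.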

Results
The first stage of the systematic search for SRs on RP identified 634 studies in eight databases. After excluding 107 duplicates (17%) and 447 studies that did not meet the inclusion criteria, 80 SRs were included in the second stage (Supplementary material).
In the second stage, all selected SRs were read and the primary studies used were captured, resulting in a total of 2356 citations. After excluding 1172 (49.7%) duplicates and 274 studies that did not meet the inclusion criteria, 910 studies were selected for the global database (Supplementary material). Owing to the existence of more than one cohort in some studies, each cohort was considered separately, resulting in 1724 publication units or reports. Separated by technique, 559 (32.4%) reports on RRP, 413 (23.9%) reports on LRP, and 752 (43.7%) reports on RARP were included (Fig. 1).
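The selection counts reported for the two stages can be checked with simple arithmetic. The snippet below is only a bookkeeping sketch using the figures stated in the text; the helper name `retained` is hypothetical.

```python
def retained(identified, duplicates, excluded):
    """Records left after removing duplicates and records failing the criteria."""
    remaining = identified - duplicates - excluded
    if remaining < 0:
        raise ValueError("more records removed than identified")
    return remaining


# Stage 1: systematic reviews identified in the eight databases.
stage_one = retained(identified=634, duplicates=107, excluded=447)

# Stage 2: primary studies cited by the included reviews.
stage_two = retained(identified=2356, duplicates=1172, excluded=274)

# Reports (cohorts) per technique, as stated in the text.
reports_by_technique = {"RRP": 559, "LRP": 413, "RARP": 752}
total_reports = sum(reports_by_technique.values())
```

With the figures above, `stage_one` is 80, `stage_two` is 910, and `total_reports` is 1724, matching the totals given in the text.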
Descriptive and comparative statistics for preoperative clinical characteristics for the three technique groups are listed in Table 1.
The mean number of patients was significantly higher for the RRP studies (n = 1577) than for the other two techniques. The higher number of patients in a shorter time led to a higher mean AVSS for RARP (64.29) than for RRP (43.26) and LRP (41.47). For the initial staging variables (PSA, Gleason score, and clinical T stage), there were significant differences between the RARP group and the other two techniques, with a higher percentage of low-risk patients (PSA <10 ng/ml, Gleason <7, stage <cT2) undergoing RARP, and a similar profile between RRP and LRP (Table 1).
Descriptive and comparative statistics for perioperative variables for the three groups are listed in Table 2. Analysis of perioperative variables revealed statistically significant differences among the three techniques for almost all the comparisons, as visualized in Figure 2. Robotic surgery showed better performance for all of the variables studied, except for operative time, which was shortest with the open approach.
Temporal analysis revealed that in the first period (before 2005), RRP and LRP were the techniques most used for patients with lower risk (PSA <10 ng/ml, Gleason <7, stage T1-2), with better perioperative results obtained with LRP. For the second period there was a change in the case pattern for RARP, with a greater proportion of patients at low risk undergoing surgery via this approach and a gradual improvement in perioperative results over time until the fourth period, when discrepancies became less evident. Regardless of the period, the best perioperative results, except for operative time, were observed for RARP (Table 3).
After simple correlation analysis among the variables studied, only AVSS was significantly correlated with the overall complication rate among the techniques (Table 4).
After nonlinear regression using the rational model, correlations were adjusted to allow prediction of complication rates by AVSS (Table 5). Using this model, AVSS simulation was performed based on the best average AVSS result among the techniques, which was for RARP, with a complication rate of 12.3% for an AVSS of 30.15 surgeries. For RRP, it took 95.33 surgeries/yr per surgeon to achieve a complication rate of 12.3%, and a similar AVSS of 95.41 surgeries/yr for LRP to achieve a complication rate of 12.8% (Fig. 3).
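The simulation step, finding the AVSS at which a fitted curve reaches a target complication rate, can be sketched as follows. This is a hedged illustration only: the text names a "rational regression model" without giving its equation, so the four-parameter form y = (a + b·x) / (1 + c·x + d·x²) is an assumption, the function names are hypothetical, and the coefficients in `DEMO_COEFFS` are invented for demonstration and are not the fitted values from Table 5.

```python
def rational(x, a, b, c, d):
    """ASSUMED rational model: y = (a + b*x) / (1 + c*x + d*x**2)."""
    return (a + b * x) / (1.0 + c * x + d * x * x)


def volume_for_rate(target, coeffs, lo=1.0, hi=300.0, tol=1e-9):
    """Find the AVSS in [lo, hi] where the curve equals `target` (bisection).

    Requires the curve to cross the target exactly once inside the bracket,
    which holds for a monotonically decreasing fit.
    """
    f_lo = rational(lo, *coeffs) - target
    f_hi = rational(hi, *coeffs) - target
    if f_lo * f_hi > 0:
        raise ValueError("target rate is not reached inside the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = rational(mid, *coeffs) - target
        if f_lo * f_mid <= 0:
            hi = mid                  # crossing lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid     # crossing lies in [mid, hi]
    return (lo + hi) / 2.0


# HYPOTHETICAL coefficients for demonstration; not the values in Table 5.
DEMO_COEFFS = (30.0, 0.5, 0.1, 0.0)

# AVSS at which the demo curve predicts a 12.3% complication rate.
demo_volume = volume_for_rate(12.3, DEMO_COEFFS)
```

Bisection is used here instead of solving the quadratic in closed form because it stays valid when d is zero and makes the bracketing assumption explicit. Substituting the fitted coefficients for each technique would give the per-technique AVSS for a common target rate.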

Discussion
In the current study we applied a new methodology called RSR to capture evidence used in SRs over the 20-yr history
To date, the largest SR comparing intraoperative and perioperative complications among the three RP techniques was performed by Tewari et al [10] in 2012, which included 39 cohorts for RRP, 57 for LRP, and 42 for RARP. The total intraoperative complication rate was significantly higher for RRP (1.5%) versus RARP (0.4%; p < 0.0001) and for LRP (1.6%) versus RARP (0.4%; p < 0.0001). There were also significant differences in the total perioperative complication rate for RARP (7.8%) versus RRP (17.9%; p < 0.0001) and versus LRP (11.1%; p = 0.002). By comparison, we gathered a significantly higher number of reports, with 148 for RRP, 243 for LRP, and 282 for RARP. The increase in the number of reports generated by our approach yielded worse results than those reported by Tewari et al, with overall rates of perioperative complications of 20.2%, 16.3%, and 12.3% for RRP, LRP, and RARP, respectively, and significant differences for all three pairwise comparisons (Table 2).
Temporal analysis over the four periods showed that in the first 5 yr after the emergence of minimally invasive surgery, RARP was not used for simpler and low-risk cases, since these patients were more frequent in LRP and RRP study cohorts (Table 3). At this stage, perioperative results with robotic surgery were not very different from those with the other techniques. In the second period, more studies involving patients with low risk undergoing RARP were published, with a consistent improvement in results up to the fourth period. In an analysis of publications carried out by our group [11], we found that the peak for publications on robotic surgery occurred in 2010, between the second and third periods (2005-2015), demonstrating the effort of the scientific community to consolidate RARP as the gold standard [12].
Comparison of the data from our methodology with prior knowledge from the literature reveals that RSR generated worse results for all outcomes, which probably reflects one of the main characteristics of this method. The fact that simple SRs show better results may be because of the need to apply strict criteria for inclusion of studies in the analyses. When several studies are included and a heterogeneous sample of population character is generated, the larger number of reports narrows the standard error of the mean, as expected from the central limit theorem, increasing the precision of the population mean, which is then more representative of the real world. Many readers accustomed to the methodological and Cartesian rigor of classical SRs may see this effect as a selection bias; however, other readers who live in a practical real-life world may identify more with RSR results, since these encompass several scenarios that can be extrapolated to daily practice, and the precision and homogeneity of classic SR can be seen as a bias in the same sense.
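The standard-error argument can be made concrete with a small calculation. The sketch below assumes independent reports with a common standard deviation, in which case the standard error scales with one over the square root of the number of reports; the function name `sem_ratio` is hypothetical, and the example uses the RRP counts quoted in this Discussion (39 cohorts in Tewari et al versus 148 reports here).

```python
import math


def sem_ratio(n_small, n_large):
    """SEM(n_large) / SEM(n_small), ASSUMING the same standard deviation."""
    if n_small <= 0 or n_large <= 0:
        raise ValueError("sample sizes must be positive")
    return math.sqrt(n_small / n_large)


# RRP: 39 cohorts (Tewari et al) versus 148 reports (this review).
rrp_ratio = sem_ratio(39, 148)
```

Under these assumptions the standard error for RRP would be roughly half as wide with 148 reports as with 39. The calculation concerns precision only and does not address whether the mean itself is biased.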
Our study revealed interesting data regarding the annual volume of surgeries that a surgeon needs to perform to obtain a complication rate similar to the average rate for RARP. To achieve a mean rate of 12.3% for overall complications, a surgeon needs to perform 95 surgeries/yr for RRP and LRP, in contrast to 30 surgeries/yr for RARP. This finding can be interpreted in two ways. A first, more superficial interpretation leads to the conclusion that the RARP learning curve is shorter, as the best complication rate among the three techniques is achieved with a lower frequency of surgeries (average of 2.5 surgeries/mo). However, a second, more in-depth analysis may identify an important selection bias in RARP studies. Considering that the average volume of annual surgeries for RARP is 64, compared to 43 for RRP and 41 for LRP, it is evident that RARP procedures are carried out by surgeons who perform a greater volume of surgeries and are therefore more experienced. In addition, studies on RARP included patients with lower-risk disease, with a higher proportion of patients with PSA <10 ng/ml, Gleason <7, and stage <cT2 (Table 1); this cohort also had a lower rate of lymphadenectomy, a procedure that adds complications in the postoperative period. Corroborating this expectation, there was a higher rate of neurovascular bundle preservation for RARP, which is usually performed in patients with lower oncological risk.
In a population study performed before the minimally invasive era, Hu et al [13] analyzed the rates of in-hospital complications for 2292 patients undergoing RRP between 1997 and 1998 in 1210 hospitals using Medicare data. The authors found a complication rate of 21.9% for low-volume (<40 surgeries/yr) and 11.8% for high-volume surgeons (≥40 surgeries/yr), with the latter similar to the rates described for RARP in our study.
The main limitation of our study is inherent to the methodology itself. The fact that the RSR includes all the studies from the SRs, regardless of the inclusion criteria, means that the sample is composed of studies that differ in quality and design. This generates a sample space with as many biases as possible until a population sample that is representative of different clinical scenarios is reached. However, this limitation is purposeful in order to allow readers to understand the power of the population sample and bring the literature data closer to "real-world" findings, since the considerable increase in sample size increases the precision of the population mean according to the central limit theorem. If readers need to compare studies in a homogeneous way, there is already an established methodology that is powerful enough to give such answers, the classic SR with meta-analysis, but with scenarios that are often unrepresentative of the reality in practice for many urologists. The intent of our methodology is not to provide a contrast to data from classic SRs but rather to provide a view of the literature data from a different perspective. If urologists need specific answers, they will certainly find more precise information from SRs with meta-analysis. However, if there is a need for a broader and more representative perspective, the data from this study can be used for comparison of results, including a surgeon's own results.
Another limitation is related to the presence of weak correlations in the univariate analysis (r < 0.39), which indicates that other variables have an influence on the complication rates. This is a consequence of the heterogeneous sample, which is a potential point of criticism. However, because of the high degree of independence and the population nature of the sample, finding a significant correlation, even if weak, made it possible to perform an adjustment in the nonlinear regression with improved correlation and, mainly, with an established clinical logic.
In addition, the narrow standard error of the mean, generated by the population sample over a period of more than 20 yr, makes it statistically practically impossible to change these results, allowing the generation of new reference values to guide patients in the choice between RP techniques.

Conclusions
Our RSR, which included a wide real-life representative sample and reference values established in the literature, revealed that minimally invasive surgery had the best perioperative and complication results, especially RARP, which was associated with less complex cases, higher annual surgeon volume, and greater performance. To achieve the same levels of complications as with RARP, the annual volume of surgery would need to be three times greater for RRP and LRP, which demonstrates the greater expertise of robotic surgeons compared to surgeons performing the other techniques in the SRs.
Our study can be used as a tool to guide patients and physicians in deciding on the best surgical treatment according to availability. Future studies using the database we constructed for this study could provide information on other oncological and functional outcomes.

Acknowledgments: The authors acknowledge the institutions involved, the study patients, and those that provided care for them.
Data sharing statement: The data that support the findings of this study are available from the corresponding author on reasonable request.

Ethics considerations:
The authors certify that the study was performed under the ethics standards laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethics standards.